AITopics | calibration network

calibration network

LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts

Hashemi, Helia, Eisner, Jason, Rosset, Corby, Van Durme, Benjamin, Kedzie, Chris

arXiv.org Artificial Intelligence, Dec-30-2024

This paper introduces a framework for the automated evaluation of natural language texts. A manually constructed rubric describes how to assess multiple dimensions of interest. To evaluate a text, a large language model (LLM) is prompted with each rubric question and produces a distribution over potential responses. The LLM predictions often fail to agree well with human judges; indeed, the humans do not fully agree with one another. However, the multiple LLM distributions can be combined to predict each human judge's annotations on all questions, including a summary question that assesses overall quality or relevance. LLM-Rubric accomplishes this by training a small feed-forward neural network that includes both judge-specific and judge-independent parameters. When evaluating dialogue systems in a human-AI information-seeking task, we find that LLM-Rubric with 9 questions (assessing dimensions such as naturalness, conciseness, and citation quality) predicts human judges' assessment of overall user satisfaction, on a scale of 1-4, with RMS error < 0.5, a 2× improvement over the uncalibrated baseline.
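The calibration step described in this abstract can be illustrated with a minimal, untrained sketch. Everything beyond "a small feed-forward network with judge-specific and judge-independent parameters" is an assumption made for illustration: the class name `CalibrationNet`, the layer sizes, the tanh activation, and the choice to add judge-specific weights to shared weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class CalibrationNet:
    """Hypothetical calibration network (illustrative, randomly initialised).

    Input: one LLM response distribution per rubric question.
    Output: a predicted distribution over a given judge's answer
    to the summary question, plus its expected score.
    """

    def __init__(self, n_questions, n_options, n_judges, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        d = n_questions * n_options
        # Judge-independent (shared) parameters.
        self.W_shared = rng.normal(0.0, 0.1, (hidden, d))
        self.b_shared = np.zeros(hidden)
        self.V = rng.normal(0.0, 0.1, (n_options, hidden))
        # Judge-specific parameters, added to the shared ones (an assumption).
        self.W_judge = rng.normal(0.0, 0.1, (n_judges, hidden, d))
        self.b_judge = np.zeros((n_judges, hidden))
        self.options = np.arange(1, n_options + 1)  # e.g. scores 1..4

    def predict(self, llm_dists, judge):
        # Flatten the per-question distributions into one feature vector.
        x = np.asarray(llm_dists, dtype=float).ravel()
        W = self.W_shared + self.W_judge[judge]
        b = self.b_shared + self.b_judge[judge]
        h = np.tanh(W @ x + b)
        p = softmax(self.V @ h)
        return p, float(p @ self.options)


# Usage: 9 rubric questions, 4 response options, 3 judges.
net = CalibrationNet(n_questions=9, n_options=4, n_judges=3)
uniform = np.full((9, 4), 0.25)  # placeholder LLM distributions
probs, expected_score = net.predict(uniform, judge=1)
```

In a real setting the weights would be fit to human annotations; here they are random, so the output only demonstrates the data flow, not a calibrated prediction.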

large language model, machine learning, natural language, (16 more...)

doi: 10.18653/v1/2024.acl-long.745

arXiv: 2501.00274

Country:

North America > United States (1.00)
Europe (0.92)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (0.67)
Education > Curriculum > Subject-Specific Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features

Sahu, Gaurav, Vechtomova, Olga

arXiv.org Artificial Intelligence, Oct-3-2024

Artistic inspiration remains one of the least understood aspects of the creative process. It plays a crucial role in producing works that resonate deeply with audiences, but the complexity and unpredictability of aesthetic stimuli that evoke inspiration have eluded systematic study. This work proposes a novel framework for computationally modeling artistic preferences in different individuals through key linguistic and stylistic properties, with a focus on lyrical content. In addition to the framework, we introduce EvocativeLines, a dataset of annotated lyric lines, categorized as either "inspiring" or "not inspiring," to facilitate the evaluation of our framework across diverse preference profiles. Our computational model leverages the proposed linguistic and poetic features and applies a calibration network on top of it to accurately forecast artistic preferences among different creative individuals. Our experiments demonstrate that our framework outperforms an out-of-the-box LLaMA-3-70b, a state-of-the-art open-source language model, by nearly 18 points. Overall, this work contributes an interpretable and flexible framework that can be adapted to analyze any type of artistic preferences that are inherently subjective across a wide spectrum of skill levels.

large language model, machine learning, natural language, (21 more...)

arXiv: 2410.02881

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (0.67)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Mitigating Biases of Large Language Models in Stance Detection with Calibration

Li, Ang, Zhao, Jingqian, Liang, Bin, Gui, Lin, Wang, Hui, Zeng, Xi, Liang, Xingwei, Wong, Kam-Fai, Xu, Ruifeng

arXiv.org Artificial Intelligence, Jun-16-2024

Large language models (LLMs) have achieved remarkable progress in many natural language processing tasks. However, our experiment reveals that, in stance detection tasks, LLMs may generate biased stances due to sentiment-stance spurious correlations and preference towards certain individuals and topics, thus harming their performance. Therefore, in this paper, we propose to Mitigate Biases of LLMs in stance detection with Calibration (MB-Cal). To be specific, a novel calibration network is devised to calibrate potential bias in the stance prediction of LLMs. Further, to address the challenge of effectively learning bias representations and the difficulty in the generalizability of debiasing, we construct counterfactual augmented data. This approach enhances the calibration network, facilitating the debiasing and out-of-domain generalization. Experimental results on in-target and zero-shot stance detection tasks show that the proposed MB-Cal can effectively mitigate biases of LLMs, achieving state-of-the-art results.

dataset, detection, stance detection, (14 more...)

arXiv: 2402.14296

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)
(13 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition

Dou, Huanzhang, Zhang, Pengyi, Su, Wei, Yu, Yunlong, Li, Xi

arXiv.org Artificial Intelligence, Jun-6-2023

Gait recognition, which aims at identifying individuals by their walking patterns, has recently drawn increasing research attention. However, gait recognition still suffers from the conflicts between the limited binary visual clues of the silhouette and numerous covariates with diverse scales, which brings challenges to the model's adaptiveness. In this paper, we address this conflict by developing a novel MetaGait that learns to learn an omni sample adaptive representation. Towards this goal, MetaGait injects meta-knowledge, which could guide the model to perceive sample-specific properties, into the calibration network of the attention mechanism to improve the adaptiveness from the omni-scale, omni-dimension, and omni-process perspectives. Specifically, we leverage the meta-knowledge across the entire process, where Meta Triple Attention and Meta Temporal Pooling are presented respectively to adaptively capture omni-scale dependency from spatial/channel/temporal dimensions simultaneously and to adaptively aggregate temporal information through integrating the merits of three complementary temporal aggregation methods. Extensive experiments demonstrate the state-of-the-art performance of the proposed MetaGait. On CASIA-B, we achieve rank-1 accuracy of 98.7%, 96.0%, and 89.3% under three conditions, respectively. On OU-MVLP, we achieve rank-1 accuracy of 92.4%.
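The idea of adaptively combining several temporal aggregation methods can be sketched as a weighted mixture of pooled features. The abstract does not name the three methods or say how the weights are produced, so this sketch assumes max, mean, and generalized-mean (GeM) pooling with caller-supplied weight logits; the function name `adaptive_temporal_pooling` and all parameters are hypothetical.

```python
import numpy as np


def adaptive_temporal_pooling(frames, logits, p=3.0, eps=1e-6):
    """Blend three temporal poolings of a (T, D) feature sequence.

    Assumptions (not from the abstract): the three methods are max,
    mean, and generalized-mean pooling, and features are non-negative.
    `logits` has length 3 and is turned into mixing weights by softmax.
    """
    x = np.asarray(frames, dtype=float)
    pooled = np.stack([
        x.max(axis=0),                                            # max pooling
        x.mean(axis=0),                                           # mean pooling
        (np.clip(x, eps, None) ** p).mean(axis=0) ** (1.0 / p),   # GeM pooling
    ])
    logits = np.asarray(logits, dtype=float)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    # Weighted sum over the three pooled vectors -> shape (D,).
    return w @ pooled


# Usage: two frames with two features each, equal mixing weights.
frames = [[1.0, 2.0], [3.0, 4.0]]
out = adaptive_temporal_pooling(frames, logits=[0.0, 0.0, 0.0])
```

In a sample-adaptive model the logits would be predicted from the input sequence itself rather than passed in by hand; fixing them here keeps the sketch self-contained.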

machine learning, pattern recognition, recognition, (15 more...)

arXiv: 2306.03445

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Zhejiang Province (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Robust real-time aircraft detection with multi-task cascaded calibration networks

#artificialintelligence, Jan-26-2022, 09:35:06 GMT

Aircraft detection is notoriously challenging owing to the orientation and size variations of aircraft objects. Existing detection pipelines sacrifice either efficiency or accuracy to deal with the large visual variations. We present a novel cascaded framework that joins object detection and orientation prediction through multi-task learning. The cascaded framework consists of three stages and operates in a coarse-to-fine manner. Each stage simultaneously rejects false targets, regresses the locations of object candidates, and gradually calibrates the orientations of the candidates toward upright.
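The three-stage, coarse-to-fine control flow described here can be sketched as follows. The stage models are stand-in stubs, not the networks from the article: the names `Candidate`, `run_stage`, `cascade`, and `make_stub_stage`, the width-based rejection rule, and the angle step sizes are all hypothetical choices made so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height


@dataclass(frozen=True)
class Candidate:
    box: Box
    angle: float  # degrees away from upright


# A stage model maps a candidate to (score, box offsets, estimated angle).
StageModel = Callable[[Candidate], Tuple[float, Box, float]]


def run_stage(cands: List[Candidate], model: StageModel,
              threshold: float) -> List[Candidate]:
    """Apply the three tasks of one stage to every candidate."""
    kept = []
    for c in cands:
        score, (dx, dy, dw, dh), est_angle = model(c)
        if score < threshold:          # 1. reject false targets
            continue
        x, y, w, h = c.box             # 2. regress the location
        new_box = (x + dx, y + dy, w + dw, h + dh)
        # 3. rotate the candidate toward upright by the estimated angle
        kept.append(Candidate(new_box, c.angle - est_angle))
    return kept


def cascade(cands: List[Candidate],
            stages: List[Tuple[StageModel, float]]) -> List[Candidate]:
    """Run the stages in order, passing survivors to the next stage."""
    for model, threshold in stages:
        cands = run_stage(cands, model, threshold)
    return cands


def make_stub_stage(step: float) -> StageModel:
    """Stub: rejects narrow boxes, estimates angle in multiples of `step`."""
    def model(c: Candidate) -> Tuple[float, Box, float]:
        score = 1.0 if c.box[2] > 10 else 0.0
        est_angle = step * round(c.angle / step)
        return score, (0.0, 0.0, 0.0, 0.0), est_angle
    return model


# Usage: coarse (90 degrees) to fine (5 degrees) orientation calibration.
stages = [(make_stub_stage(90.0), 0.5),
          (make_stub_stage(30.0), 0.5),
          (make_stub_stage(5.0), 0.5)]
inputs = [Candidate((0.0, 0.0, 40.0, 20.0), 100.0),   # plausible target
          Candidate((5.0, 5.0, 4.0, 4.0), 10.0)]      # too narrow, rejected
results = cascade(inputs, stages)
```

Rejecting candidates early means later, finer stages process fewer boxes, which is the usual efficiency argument for a cascade.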

calibration network, representation, robust real-time aircraft detection, (3 more...)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.40)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Architecture > Real Time Systems (0.59)
